Add ReservoirSampling algorithm to randomized module #6204

Merged
siriak merged 7 commits into TheAlgorithms:master on Apr 7, 2025

Conversation

cureprotocols
Contributor

@cureprotocols cureprotocols commented Mar 30, 2025

Algorithm Overview:

  • Efficient for selecting k random elements from a stream of unknown size
  • Commonly used in streaming systems, big data pipelines, and memory-limited environments
  • Time Complexity: O(n)
  • Space Complexity: O(k)
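
For readers unfamiliar with the technique, here is a minimal sketch of reservoir sampling (the classic "Algorithm R"). It is not the code merged in this PR: the class name `ReservoirSketch`, the `int[]` return type, and the explicit `Random` parameter are assumptions made for illustration. It does one pass over the input (O(n) time) and keeps only the k-element reservoir (O(k) space).

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch class; not the ReservoirSampling class from this PR.
final class ReservoirSketch {
    private ReservoirSketch() {
    }

    /**
     * Returns k elements chosen uniformly at random from stream (Algorithm R).
     * Assumes 0 <= k <= stream.length; the array stands in for a stream
     * that is read once, front to back.
     */
    static int[] sample(int[] stream, int k, Random rng) {
        int[] reservoir = new int[k];
        // Fill the reservoir with the first k elements.
        for (int i = 0; i < k; i++) {
            reservoir[i] = stream[i];
        }
        // Element i (0-based) enters the reservoir with probability k / (i + 1),
        // evicting a uniformly chosen current member.
        for (int i = k; i < stream.length; i++) {
            int j = rng.nextInt(i + 1); // uniform in [0, i]
            if (j < k) {
                reservoir[j] = stream[i];
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        int[] stream = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        // Output varies from run to run because the Random is unseeded.
        System.out.println(Arrays.toString(sample(stream, 3, new Random())));
    }
}
```

Passing the `Random` in, rather than creating it inside the method, is a design choice of this sketch: it lets a caller supply a seeded generator for reproducible runs.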

Implementation Details:

  • Class: ReservoirSampling
  • Package: com.thealgorithms.randomized
  • JavaDoc included for class and method
  • Demonstration included in main() method
  • File name and class name follow PascalCase
  • Fully formatted using clang-format

Reference:

Author: Michael Alexander Montoya (@cureprotocols)

  • I have read CONTRIBUTING.md.
  • This pull request is all my own work -- I have not plagiarized it.
  • All filenames are in PascalCase.
  • All functions and variable names follow Java naming conventions.
  • All new algorithms have a URL in their comments that points to Wikipedia or other similar explanations.
  • All new code is formatted with clang-format -i --style=file path/to/your/file.java

@DenizAltunkapan
Collaborator

What if sampleSize > stream.length? Perhaps there should be error handling for invalid input. The rest LGTM.

@cureprotocols
Contributor Author

cureprotocols commented Mar 31, 2025

✅ Added input validation for sampleSize > stream.length as requested — thanks for the suggestion, @DenizAltunkapan!

Let me know if anything else is needed — happy to improve further 🙌
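
The kind of check discussed here might look like the sketch below. The helper name `requireValidSampleSize`, the exception type, and the messages are assumptions; the thread does not show the merged validation. The null and negative-size checks go beyond what the reviewer asked for and are included only as a plausible extension.

```java
// Hypothetical helper; the merged code may word or structure this check differently.
final class SampleSizeCheck {
    private SampleSizeCheck() {
    }

    /** Throws IllegalArgumentException if sampleSize cannot be drawn from stream. */
    static void requireValidSampleSize(int[] stream, int sampleSize) {
        if (stream == null) {
            throw new IllegalArgumentException("stream must not be null");
        }
        if (sampleSize < 0) {
            throw new IllegalArgumentException("sampleSize must not be negative: " + sampleSize);
        }
        // The case raised in review: asking for more elements than the input holds.
        if (sampleSize > stream.length) {
            throw new IllegalArgumentException("sampleSize (" + sampleSize + ") exceeds stream length (" + stream.length + ")");
        }
    }
}
```

Failing fast with an exception keeps the sampler from silently returning a short or padded reservoir when the request cannot be satisfied.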

Member

@siriak siriak left a comment

The code looks good. Could you please add some JUnit tests and remove main()? You could check that the correct number of elements is returned, that they all come from the initial set, and maybe other properties of the algorithm (see https://github.com/TheAlgorithms/Java/tree/master/src/test/java/com/thealgorithms).
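
The properties suggested above could be checked roughly as follows. This is a sketch in plain Java so that it runs without JUnit on the classpath; in a real test class each check would be a `@Test` method using JUnit assertions. The `sample` method is a stand-in sampler with an assumed signature, not the method under review, and the "each element is selected about equally often" check is one possible reading of "other properties of the algorithm".

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical checks; names and signatures are illustrative only.
final class SamplerPropertyChecks {
    private SamplerPropertyChecks() {
    }

    // Stand-in sampler so the sketch compiles on its own.
    static List<Integer> sample(int[] stream, int sampleSize, Random rng) {
        List<Integer> reservoir = new ArrayList<>(sampleSize);
        for (int i = 0; i < stream.length; i++) {
            if (i < sampleSize) {
                reservoir.add(stream[i]);
            } else {
                int j = rng.nextInt(i + 1);
                if (j < sampleSize) {
                    reservoir.set(j, stream[i]);
                }
            }
        }
        return reservoir;
    }

    /** Property 1: the sample has exactly the requested size. */
    static boolean hasExpectedSize(List<Integer> sample, int sampleSize) {
        return sample.size() == sampleSize;
    }

    /** Property 2: every sampled value occurs in the original stream. */
    static boolean isDrawnFrom(List<Integer> sample, int[] stream) {
        Set<Integer> allowed = new HashSet<>();
        for (int value : stream) {
            allowed.add(value);
        }
        return allowed.containsAll(sample);
    }

    /** Property 3 helper: how often each of n stream values is selected over many trials. */
    static int[] selectionCounts(int n, int k, int trials, Random rng) {
        int[] stream = new int[n];
        for (int i = 0; i < n; i++) {
            stream[i] = i;
        }
        int[] counts = new int[n];
        for (int t = 0; t < trials; t++) {
            for (int value : sample(stream, k, rng)) {
                counts[value]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Random rng = new Random(42); // seeded so repeated runs behave the same
        int[] stream = {10, 20, 30, 40, 50};
        List<Integer> result = sample(stream, 2, rng);
        System.out.println("size ok: " + hasExpectedSize(result, 2));
        System.out.println("subset ok: " + isDrawnFrom(result, stream));
        // Each count should be close to trials * k / n = 30000.
        for (int count : selectionCounts(10, 3, 100_000, rng)) {
            System.out.println(count);
        }
    }
}
```

Seeding the generator keeps a statistical check like the third one deterministic, which avoids flaky test runs.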

@codecov-commenter

codecov-commenter commented Apr 3, 2025

Codecov Report

Attention: Patch coverage is 84.61538% with 2 lines in your changes missing coverage. Please review.

Project coverage is 73.78%. Comparing base (2570a99) to head (5341847).

Files with missing lines | Patch % | Lines
...om/thealgorithms/randomized/ReservoirSampling.java | 84.61% | 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #6204      +/-   ##
============================================
- Coverage     73.78%   73.78%   -0.01%     
- Complexity     5299     5304       +5     
============================================
  Files           671      672       +1     
  Lines         18344    18357      +13     
  Branches       3546     3549       +3     
============================================
+ Hits          13536    13545       +9     
- Misses         4262     4265       +3     
- Partials        546      547       +1     

@cureprotocols
Contributor Author

✅ Applied clang-format and added a JUnit test to src/test/java/com/thealgorithms/randomized/.

All requested changes completed. Ready for final review 💪

siriak previously approved these changes Apr 6, 2025
Member

@siriak siriak left a comment

Code looks good. Please fix the PR checks and it's ready to merge. Thank you for your patience; I'm very busy at the moment, so response times are high :)

@cureprotocols
Contributor Author

✅ Final test file now formatted with clang-format --style=file
✅ Verified formatting with --dry-run --Werror
✅ Test class is correctly placed under:
src/test/java/com/thealgorithms/randomized/ReservoirSamplingTest.java

CI will re-run shortly. Appreciate all the support — ready for merge when you are 💪

@siriak
Member

siriak commented Apr 6, 2025

Error: /home/runner/work/Java/Java/src/main/java/com/thealgorithms/randomized/ReservoirSampling.java:19:1: Utility classes should not have a public or default constructor. [HideUtilityClassConstructor]
Error: /home/runner/work/Java/Java/src/test/java/com/thealgorithms/randomized/ReservoirSamplingTest.java:3:47: Using the '.*' form of import should be avoided - org.junit.jupiter.api.Assertions.*. [AvoidStarImport]

@cureprotocols
Contributor Author

✅ All requested changes are now complete:

  • ✅ Converted ReservoirSampling to final class with private constructor
  • ✅ No wildcard imports (org.junit.jupiter.api.Assertions.* removed)
  • ✅ JUnit test placed in correct directory: src/test/java/com/thealgorithms/randomized
  • ✅ Verified both files pass clang-format --style=file
  • ✅ Confirmed formatting via --dry-run --Werror

Thanks again for your review and feedback. Ready for merge when you are.
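
For reference, the shape that satisfies Checkstyle's HideUtilityClassConstructor check is a final class with a private constructor and only static members. The sketch below uses a hypothetical class `SamplingUtil` with a placeholder method; it is not the merged ReservoirSampling class.

```java
import java.util.Random;

// Hypothetical utility class illustrating the final-class, private-constructor pattern.
final class SamplingUtil {
    // Private constructor: the class only holds static methods and is never instantiated.
    private SamplingUtil() {
    }

    /** Returns a uniformly random index in [0, bound). */
    static int randomIndex(int bound, Random rng) {
        return rng.nextInt(bound);
    }
}
```

The AvoidStarImport finding is addressed in the test file by naming each assertion explicitly, for example `import static org.junit.jupiter.api.Assertions.assertEquals;` in place of a wildcard import of `Assertions`.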

@siriak siriak merged commit f53bc00 into TheAlgorithms:master Apr 7, 2025
6 checks passed
4 participants